Start with the tabulated BLAST output, myfile.blast.out. Then check the two-liners from: http://bergman-lab.blogspot.com/2009/12/ncbi-blast-tabular-output-format-fields.html

A few lines needed to output proper GFF are missing there, but you can either go for a minimalistic GFF or try to encode everything in column 9. You can also try validating your GFF3 here: http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online

NOTE: The blog linked above does not seem to exist anymore; here is its content from the Wayback Machine:

NCBI Blast Tabular output format fields

Certainly, with the new NCBI Blast+ tools, you won't need this anymore, but as long as we are sticking with the old blastall program with its horrible documentation, I keep forgetting the format of the BLAST tabular reports. Tabular format is created when you specify " So here is the meaning of the fields:
Parsing is then simple Python:
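A minimal sketch of such parsing, assuming the standard 12-column tabular layout (what blastall `-m 8` or BLAST+ `-outfmt 6` write by default). The field names are the BLAST+ column identifiers; `parse_hit` and `hit_to_gff3` are hypothetical helper names, and placing the feature on the subject sequence with the query as `Target` is just one possible convention for a minimalistic GFF3 line.

```python
from collections import namedtuple

# Assumed column order of the standard 12-column tabular report.
FIELDS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore")
Hit = namedtuple("Hit", FIELDS)


def parse_hit(line):
    """Split one tab-separated hit line and convert the numeric columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != len(FIELDS):
        raise ValueError("expected 12 columns, got %d" % len(cols))
    return Hit(cols[0], cols[1], float(cols[2]),
               int(cols[3]), int(cols[4]), int(cols[5]),
               int(cols[6]), int(cols[7]), int(cols[8]), int(cols[9]),
               float(cols[10]), float(cols[11]))


def hit_to_gff3(hit, source="blast", ftype="match"):
    """Build a minimal GFF3 line located on the subject sequence.

    GFF3 requires start <= end, so reversed subject coordinates are
    swapped and reported as the minus strand. IDs are written as-is;
    characters reserved in column 9 would still need escaping.
    """
    start, end = sorted((hit.sstart, hit.send))
    strand = "+" if hit.sstart <= hit.send else "-"
    attrs = "Name={0};Target={0} {1} {2}".format(hit.qseqid, hit.qstart, hit.qend)
    return "\t".join([hit.sseqid, source, ftype, str(start), str(end),
                      "{:g}".format(hit.evalue), strand, ".", attrs])
```

A GFF3 file also needs `##gff-version 3` as its first line, so a caller would print that once before the converted hits.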
Perl:
I found this via Google: http://jperl.googlecode.com/svn-history/r16/trunk/Blast2Gff.pl Otherwise, I would save my BLAST result as XML and transform it to GFF with what should be a simple XSLT stylesheet. As an example, you can have a look at my 'old' stylesheet blast2svg: http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/blast2svg.xsl
Pierre
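For readers who would rather not write XSLT, the same XML route can be sketched in Python with the standard library. This is a swapped-in technique, not Pierre's stylesheet. It assumes the classic BLAST XML layout (`Iteration`, `Hit`, `Hsp` elements with `Hsp_hit-from`, `Hsp_hit-to`, and so on) and that reversed hit coordinates mean the minus strand; `blast_xml_to_gff3` is a hypothetical function name.

```python
import xml.etree.ElementTree as ET


def blast_xml_to_gff3(xml_text):
    """Return one minimal GFF3 line per HSP found in BLAST XML text."""
    root = ET.fromstring(xml_text)
    lines = []
    for iteration in root.iter("Iteration"):
        query_def = iteration.findtext("Iteration_query-def", default="query")
        # Keep only the first token so the Target ID contains no spaces.
        query = query_def.split()[0] if query_def.split() else "query"
        for hit in iteration.iter("Hit"):
            # Hit_id can be an internal identifier for local databases;
            # Hit_def or Hit_accession may be a better choice there.
            subject = hit.findtext("Hit_id", default="subject")
            for hsp in hit.iter("Hsp"):
                h_from = int(hsp.findtext("Hsp_hit-from"))
                h_to = int(hsp.findtext("Hsp_hit-to"))
                start, end = sorted((h_from, h_to))
                strand = "+" if h_from <= h_to else "-"
                target = "Target={} {} {}".format(
                    query,
                    hsp.findtext("Hsp_query-from"),
                    hsp.findtext("Hsp_query-to"))
                lines.append("\t".join([
                    subject, "blast", "match", str(start), str(end),
                    hsp.findtext("Hsp_evalue", default="."),
                    strand, ".", target]))
    return lines
```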
You can use the script Pierre found with a slight modification. It is a bit crude and does no real error checking, but it works. The problem is that it fails if the BLAST file has a header like this:
So one should filter out lines beginning with "#", and it does no harm to also skip lines that are empty or contain only whitespace. To do that, edit the file Blast2Gff.pl and, at line 149, add:
This part should then look like below; then try again.
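The same filtering idea, shown as a small Python generator rather than as the actual Perl edit to Blast2Gff.pl: skip comment lines starting with "#" and lines that are empty or whitespace-only before handing anything to a parser. `data_lines` is a hypothetical name, and the header text in the usage example is made up for illustration.

```python
def data_lines(lines):
    """Yield only real hit lines, dropping '#' comments and blank lines."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        yield line.rstrip("\n")


# Usage with an in-memory example; a file handle works the same way.
example = ["# hypothetical header line\n", "\n", "   \n", "q1\tchr1\n"]
kept = list(data_lines(example))
```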
Have you tried these scripts: http://gmod.org/wiki/Load_BLAST_Into_Chado and http://www.bioperl.org/pipermail/bioperl-l/2002-November/010223.html? Maybe the PSL format is better suited to representing an alignment. You can also look at the BED format, so that you can later work with BedTools.
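If BED is the target, the main thing to get right is the coordinate system: BLAST positions are 1-based and inclusive, while BED uses a 0-based start and an exclusive end. A rough sketch for a six-column BED line follows; `hit_to_bed` is a hypothetical helper, and clamping the rounded bit score into BED's 0 to 1000 score range is an arbitrary choice made here.

```python
def hit_to_bed(sseqid, sstart, send, qseqid, bitscore):
    """Convert one hit (subject coordinates, 1-based inclusive) to BED6."""
    start, end = sorted((sstart, send))
    strand = "+" if sstart <= send else "-"
    # BED scores must be integers between 0 and 1000.
    score = max(0, min(1000, int(round(bitscore))))
    # Subtract 1 from the start only: 0-based start, exclusive end.
    return "\t".join([sseqid, str(start - 1), str(end),
                      qseqid, str(score), strand])
```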